98 research outputs found

    A data cube model for analysis of high volumes of ambient data

    Get PDF
    Ambient systems generate large volumes of data for many of their application areas with XML often the format for data exchange. As a result, large scale ambient systems such as smart cities require some form of optimization before different components can merge their data streams. In data warehousing, the cube structure is often used for optimizing the analytics process with more recent structures such as dwarf, providing new orders of magnitude in terms of optimizing data extraction. However, these systems were developed for relational data and as a result, we now present the development of an XML dwarf to manage ambient systems generating XML data

    Enrichment of raw sensor data to enable high-level queries

    Get PDF
    Sensor networks are increasingly used across various application domains. Their usage has the advantage of automated, often continuous, monitoring of activities and events. Ubiquitous sensor networks detect location of people and objects and their movement. In our research, we employ a ubiquitous sensor network to track the movement of players in a tennis match. By doing so, our goal is to create a detailed analysis of how the match progressed, recording points scored, games and sets, and in doing so, greatly reduce the eort of coaches and players who are required to study matches afterwards. The sensor network is highly efficient as it eliminates the need for manual recording of the match. However, it generates raw data that is unusable by domain experts as it contains no frame of reference or context and cannot be analyzed or queried. In this work, we present the UbiQuSE system of data transformers which bridges the gap between raw sensor data and the high-level requirements of domain specialists such as the tennis coach

    Pattern based processing of XPath queries

    Get PDF
    As the popularity of areas including document storage and distributed systems continues to grow, the demand for high performance XML databases is increasingly evident. This has led to a number of research eorts aimed at exploiting the maturity of relational database systems in order to in- crease XML query performance. In our approach, we use an index structure based on a metamodel for XML databases combined with relational database technology to facilitate fast access to XML document elements. The query process involves transforming XPath expressions to SQL which can be executed over our optimised query engine. As there are many dierent types of XPath queries, varying processing logic may be applied to boost performance not only to indi- vidual XPath axes, but across multiple axes simultaneously. This paper describes a pattern based approach to XPath query processing, which permits the execution of a group of XPath location steps in parallel

    Classification of index partitions to boost XML query performance

    Get PDF
    XML query optimization continues to occupy considerable research effort due to the increasing usage of XML data. Despite many innovations over recent years, XML databases struggle to compete with more traditional database systems. Rather than using node indexes, some efforts have begun to focus on creating partitions of nodes within indexes. The motivation is to quickly eliminate large sections of the XML tree based on the partition they occupy. In this research, we present one such partition index that is unlike current approaches in how it determines size and number of these partitions. Furthermore, we provide a process for compacting the index and reducing the number of node access operations in order to optimize XML queries

    An automated ETL for online datasets

    Get PDF
    While using online datasets for machine learning is commonplace today, the quality of these datasets impacts on the performance of prediction algorithms. One method for improving the semantics of new data sources is to map these sources to a common data model or ontology. While semantic and structural heterogeneities must still be resolved, this provides a well established approach to providing clean datasets, suitable for machine learning and analysis. However, when there is a requirement for a close to real time usage of online data, a method for dynamic Extract-Transform-Load of new sources data must be developed. In this work, we present a framework for integrating online and enterprise data sources, in close to real time, to provide datasets for machine learning and predictive algorithms. An exhaustive evaluation compares a human built data transformation process with our system’s machine generated ETL process, with very favourable results, illustrating the value and impact of an automated approach

    Capturing personal health data from wearable sensors

    Get PDF
    Recently, there has been a significant growth in pervasive computing and ubiquitous sensing which strives to develop and deploy sensing technology all around us. We are also seeing the emergence of applications such as environmental and personal health monitoring to leverage data from a physical world. Most of the developments in this area have been concerned with either developing the sensing technologies, or the infrastructure (middleware) to gather this data and the issues which have been addressed include power consumption on the devices, security of data transmission, networking challenges in gathering and storing the data and fault tolerance in the event of network and/or device failure. Research is focusing on harvesting and managing data and providing query capabilities

    Desirable properties for XML update mechanisms

    Get PDF
    The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages has resulted in significant research in the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents has more recently prompted research in dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme and present our findings through an evaluation matrix for most of the existing schemes that provide update functionality

    An automated ETL for online datasets

    Get PDF
    While using online datasets for machine learning is commonplace today, the quality of these datasets impacts on the performance of prediction algorithms. One method for improving the semantics of new data sources is to map these sources to a common data model or ontology. While semantic and structural heterogeneities must still be resolved, this provides a well established approach to providing clean datasets, suitable for machine learning and analysis. However, when there is a requirement for a close to real time usage of online data, a method for dynamic Extract-Transform-Load of new sources data must be developed. In this work, we present a framework for integrating online and enterprise data sources, in close to real time, to provide datasets for machine learning and predictive algorithms. An exhaustive evaluation compares a human built data transformation process with our system’s machine generated ETL process, with very favourable results, illustrating the value and impact of an automated approach

    HealthSense: an application for querying raw sensor data

    Get PDF
    New sensing technologies and the decreasing cost of Information and Communication Technologies (ICTs) make possible the development of electronic Health (eHealth) monitoring systems. The challenges of such systems include the representation of data extracted from various sensor devices by knowledge workers through semantic enrichment and integration. Also, the data must be stored in a format suitable for querying and further analysis. This paper describes the demonstration of the HealthSense system which captures and queries personal health data extracted from wearable sensors

    Extracting tennis statistics from wireless sensing environments

    Get PDF
    Creating statistics from sporting events is now widespread with most eorts to automate this process using various sensor devices. The problem with many of these statistical applications is that they require proprietary applications to process the sensed data and there is rarely an option to express a wide range of query types. Instead, applications tend to contain built-in queries with predened outputs. In the research presented in this paper, data from a wireless network is converted to a structured and highly interoperable format to facilitate user queries by expressing high level queries in a standard database language and automatically generating the results required by coaches
    corecore